Skip to content
This repository has been archived by the owner on Aug 20, 2024. It is now read-only.

File Serialization of Annotations #1277

Merged
merged 21 commits into from
Aug 11, 2020
Merged

File Serialization of Annotations #1277

merged 21 commits into from
Aug 11, 2020

Conversation

seldridge
Copy link
Member

@seldridge seldridge commented Dec 10, 2019

This adds an annotation mix-in, CustomFileEmission, that lets an annotation be serialized to a file as opposed to included in the output annotation JSON. An annotation then controls:

  1. How it should be serialized to a string (method toBytes)
  2. Which file to put this string in (methods baseFileName, suffix, filename)
  3. How to replace the file-serialized annotation with some other annotations that let downstream stages know about the serialized file (method replacements)

The stage-global phase WriteOutputAnnotations is modified to handle CustomFileEmission annotations such that it writes their contents to the requested file and replaces them with their replacements. This means that the FIRRTL Stage-specific Phase WriteEmitted for writing emitted annotations is no longer necessary. This is thereby deprecated. A consequence of this is that emitted annotations are no longer deleted. This simplifies the reconstruction of the deprecated FirrtlExecutionResult because the deleted emitted annotation no longer has to be examined.

This migrates EmittedAnnotation and its children, e.g., EmittedFirrtlCircuitAnnotation to use this new file serialization approach. Other candidates for changing include blackbox helpers, memlib, and memory initialization.

This includes the following minor updates:

  • WriteOutputAnnotationsSpec is updated to check the above logic of CustomFileEmission
  • VerilogExecution is modified to use the execute method (which includes Stage-global Phases) as opposed to the run method (which does not). This is necessary so that WriteOutputAnnotations runs.

I would propose that this fixes #791, but would like @hcook feedback on this approach. As a fix for #791, any transform that wants to write a file needs to emit an annotation that mixes in CustomFileEmission. The default Phases common to every Stage take care of the rest.

Checklist

  • Did you specify the type of improvement?
  • Did you state the API impact?
  • Did you specify the code generation impact?
  • Did you request a desired merge strategy?

TODO

Type of Improvement

  • code refactoring
  • new feature/API

API Impact

Adds the CustomFileEmission trait and changes the way file serialization works for EmittedAnnotations. Removes WriteEmitted.

Backend Code Generation Impact

No Verilog/FIRRTL generation changes.

Desired Merge Strategy

  • Squash: The PR will be squashed and merged (choose this if you have no preference.

@seldridge seldridge force-pushed the how-to-serialize branch 3 times, most recently from 25f02c0 to be36f7e Compare December 10, 2019 17:49
@seldridge seldridge added this to the 1.3.0 milestone Dec 10, 2019
@seldridge seldridge added API Addition (no impact on existing code) API Modification bugfix Merge Commit Please This branch has useful history, please merge with a merge commit labels Dec 10, 2019
@seldridge seldridge marked this pull request as ready for review December 10, 2019 17:56
@seldridge seldridge requested a review from a team as a code owner December 10, 2019 17:56
@seldridge seldridge requested a review from jackkoenig December 10, 2019 19:59
@seldridge
Copy link
Member Author

Feedback from @jackkoenig in the last dev meeting was that this would probably be better to serialize to a Stream. I think this means, this should have a toStream(): Stream[Char] method implemented. @jackkoenig: does that seem reasonable?

Also, what this PR proposes should be sufficient to replace Rocket Chip's ElaborationArtefacts. E.g., instead of emitting "something else" that's serialize to a file, you emit an annotation that defines how it should be serialized and what file it should be serialized to. Notably ElaborationArtefacts also has the ability to check if the file that someone is trying to generate would already be written to by someone else. The thing generating the artefact can then spin to generate a new name. This motivates having the ability to force uniqueness of generated files.

@jackkoenig
Copy link
Contributor

jackkoenig commented Dec 23, 2019

I think this means, this should have a toStream(): Stream[Char] method implemented. @jackkoenig: does that seem reasonable?

That's the right idea, although I'm not certain if Stream[Char] is the right type. It may work perfectly but because Stream (which is deprecated in Scala 2.13 in favor of LazyList) is effectively a List, it may incur a large performance penalty per element, in this case, Char. It might be that a Stream[String] is more appropriate.

I have to admit that when I originally said "stream" I was thinking about like Java OutputStream, but that's obviously mutable so not the most desirable API. I'm now wondering how this kind of thing is accomplished in the Scala functional world (ZIO, Scalaz, Cats, etc.). Is there a standard approach?

Before we finalize the API, I think we should see if there's an obvious approach out there, and then do some benchmarking. Assuming Stream* is the right approach, we should compare Stream[Char] and Stream[String]. While my intuition is that Stream[Char] has a large overhead, I can also see the JIT making that overhead basically zero so it might actually be even better in practice than Stream[String].

*Unfortunately it looks like LazyList is missing from the compatibility collections library

@seldridge
Copy link
Member Author

From the dev meeting discussion:

  • The underlying serialization strategy should be Stream[Byte]
  • Utility methods/mix-ins should exist for making this more friendly, e.g., a serialization defined to String]
  • twitter.chill.meatlocker may provide an easy way to serialize objects

seldridge and others added 21 commits August 11, 2020 16:18
Switch from using FirrtlStage.transform to FirrtlStage.run in one
test. The latter is problematic as it doesn't include wrappers or
pre/post phases which are how things will work in the future for doing
file writing (via HowToSerialize ideas).

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Changes the FirrtlStage method in FIRRTL testing infrastructure from
"run" (which does not include Stage-global Phases) to "execute" (which
does).

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
This adds an Annotation mix-in, HowToSerialize, that allows an
annotation to declare how it should be serialized to a file. The
mix-in is abstract in a baseFileName and a suffix (used to generate a
filename), a howToSerialize method (defining the string contents of
the file), and a howToResume method (that defines a replacement for
the file-serialized annotation that allows this to be resumed) [^1].

A default implementation for generating a filename (called filename)
is defined that will put the baseFileName+suffix file in the target
directory. This can be overridden by the annotation if desired.

[^1]: When an annotation is serialized to a file, it should be removed
from the emitted JSON-serialized annotations. The howToResume method
defines a way of adding replacement annotations to the JSON-serialized
annotations that tell a downstream tool how to find the serialized
file. E.g., if a FIRRTL circuit is written to a file, this could be
used to add a FirrtlFileAnnotation defining the location of the new
file.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
This extends firrtl.options.phase.WriteOutputAnnotations to serialize
HowToSerialize annotations to files.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
This adds tests of the HowToSerialize mix-in inside the
WriteOutputAnnotationsSpec.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
This migrates EmittedAnnotations (and its children) to mixin the
HowToSerialize trait. This enables this annotations to be
automatically written to files via WriteOutputAnnotations

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
This converts the HowToSerialize trait to use a Stream[Char] when
defining how an annotation should be serialized.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Change the type of the HowToSerialize.howToSerialize method from a
stream to an iterable. Using the latter (the superset of both lazy
streams and non-lazy things like String) avoids problems with users
having to choose laziness when they already have an eager object.

In effect, this makes the API more general.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Add a default implementation of CustomFileEmission.replacements.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Change the type of CustomFileEmission.replacements from
Option[AnnotationSeq] to AnnotationSeq. The latter has all the
properties of the former that I'm trying to express here: (1) can
emptiness and (2) monadicity (if the AnnotationSeq is converted to a
sequence first). The latter property is exploited in the
WriteOutputAnnotations phase to concisely flatMap over the annotations
and doing the double-monad is unnecessary.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Change the API of CustomFileEmission to use a final def for the actual
filename. The baseFileName is then made a method with an AnnotationSeq
parameter to allow the filename to change as a function of other
annotations, e.g., by an output circuit annotation.

By restricting this API, we have more control over the default
behavior of where things are written using the fixed behavior of the
filename method---files will always be written using the behavior that
StageOptions define. Previously, if users want customized behavior,
they would need to duplicate this StageOptions functionality (and
likely subtly deviate from the standard behavior and introduce
problems with their build).

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Set behavior of file conflicts in CustomFileEmission to be the
following: No two annotations in the same annotation sequence can
serialize to the same file during the WriteOutputAnnotations phase.
However, if the output annotation file already exists, it will be
overwritten.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Change FirrtlOptions.getBuildFileName to simply serialize the
underlying Java File instead of converting this to its canonical path.
This should improve the relocatability of files produced by the
CustomFileEmission API.

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Normalize paths inside the getBuildFileName utility of StageOptions.
Add a check to prevent a null pointer dereference.

Co-authored-by: Jack Koenig <koenig@sifive.com>
Co-authored-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Co-authored-by: Jack Koenig <koenig@sifive.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Co-authored-by: Jack Koenig <koenig@sifive.com>
Co-authored-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@ibm.com>
@seldridge
Copy link
Member Author

@jackkoenig: This is ready for a re-review. I've updated the companion Chisel PR and it is passing locally for me.

@@ -92,17 +95,35 @@ final case class EmittedFirrtlModule(name: String, value: String, outputSuffix:
final case class EmittedVerilogModule(name: String, value: String, outputSuffix: String) extends EmittedModule

/** Traits for Annotations containing emitted components */
sealed trait EmittedAnnotation[T <: EmittedComponent] extends NoTargetAnnotation with Unserializable {
sealed trait EmittedAnnotation[T <: EmittedComponent] extends NoTargetAnnotation with CustomFileEmission {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why, but I didn't notice that this was just using the API here. Pretty slick :)

Copy link
Contributor

@jackkoenig jackkoenig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work and thanks for the persistence on this!

@seldridge seldridge added the Please Merge Accepted PRs that are ready to be merged. Useful when waiting on CI. label Aug 11, 2020
@mergify mergify bot merged commit 8bdbbac into master Aug 11, 2020
@seldridge seldridge deleted the how-to-serialize branch August 12, 2020 20:39
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
API Addition (no impact on existing code) API Modification bugfix Merge Commit Please This branch has useful history, please merge with a merge commit Please Merge Accepted PRs that are ready to be merged. Useful when waiting on CI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Directory for Transform metadata? Prune/Don't Emit Output Annotation Sequence
4 participants